METHOD AND APPARATUS IN TRANSMISSION OF IMAGES



1. Patent proposal on Independent coding of 
Regions of Interest. 

2. Technical field

The present invention relates to a method and a device for coding and extraction of regions of
interest (ROI) in transmission of still images and video. The method and the device are
particularly well suited for transform-based coders such as wavelet and DCT coders.

3. BACKGROUND OF THE INVENTION AND PRIOR ART 

In transmission of digitized still images from a transmitter to a receiver, the image is usually
coded in order to reduce the number of bits required for transmitting the image.
The reason for reducing the number of bits is usually that the capacity of the channel used is
limited. A digitized image, however, consists of a very large number of bits. When transmitting
such an image over a channel which has a limited bandwidth, transmission times for most
applications become unacceptably long if every bit of the image has to be transmitted.

Therefore, much research effort in recent years has concerned coding methods and techniques
for digitized images, aiming at reducing the number of bits that need to be transmitted.

These methods can be divided into two groups: 

Lossless methods, i.e. methods exploiting the redundancy in the image in such a manner that the
image can be reconstructed by the receiver without any loss of information.
Lossy methods, i.e. methods exploiting the fact that all bits are not equally important to the
receiver; hence the received image is not identical to the original but looks, e.g. to the human
eye, sufficiently like the original image.

4. Problem area 

In some applications, parts of a transmitted image may be more interesting than the rest of the
image, and a better visual quality of these parts of the image is therefore desired. Such a part is
usually termed a region of interest (ROI). Applications in which this can be useful are, for example,
medical databases or the transfer of satellite images. In some cases it is also desired or required
that the region of interest is transmitted losslessly, while the quality of the rest of the image is of
less importance. There are also cases where it is required that the regions of interest are extracted
from the bit stream and decoded without having to decode the whole image.

5. Best mode of the invention 

Below, the method for wavelet-based coders is described. However, the method is equally applicable
to DCT-based schemes.

5.1 Introduction

The basic idea is, when the transformation of the image is done, to use a mask in the transform
domain that describes which coefficients in the transform domain are needed for reconstruction of
the regions of interest and the background, in order to classify the transform coefficients into
segments. The mask can be created by, for example, using the technique described in P08512 and P08983SE.
A segment is here defined as all coefficients in the transform domain that belong to a certain object
or the background. The segment can then be divided further into subsets.
A subset is here defined as a number of coefficients in a part of the transform domain (e.g. a
subband in the wavelet transform case) which are needed for reconstruction and belong to a
segment in the digitized image, see Figure 3.

Once this classification is made, the segments are coded independently to different levels of
accuracy, which yields a bit stream for every segment. These bit streams are then mixed
together in som 

5.2 Description of operation (Details)

The proposed method works in the encoder as follows on a digitized image, see Figure 1:

1. Perform a forward transformation on the image to be transmitted. 

2. When the information about how to divide the digitized image into objects and background
is available, create a mask in the transform domain, using for example the technique described
in P08512 and P08983SE, describing which coefficients are needed to reconstruct
the different objects or the background.

3. Using the mask, classify the transform coefficients into segments.

4. Code the segments independently. This will give the number of bits required for each
subset.

5. Concatenate the bit streams together with the stream and header information needed.

This requires a bit stream description, which can be found below.

6. Send the concatenated bit stream. 
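
Steps 3 to 5 above can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the coder of the proposal: the coefficient list stands in for the output of some forward transform, the "coder" is plain uniform quantisation followed by 16-bit packing, and the names `classify`, `code_segment`, `concatenate` and the header layout are all hypothetical.

```python
import struct


def classify(coefficients, mask):
    """Step 3: group coefficients by the segment label the mask assigns them."""
    segments = {}
    for value, label in zip(coefficients, mask):
        segments.setdefault(label, []).append(value)
    return segments


def code_segment(values, step):
    """Step 4 (stand-in coder): uniform quantisation, then 16-bit packing."""
    quantised = [round(v / step) for v in values]
    return struct.pack(f">{len(quantised)}h", *quantised)


def concatenate(streams):
    """Step 5: a header of (label, offset, length) entries, then the body.

    Offsets are counted in bytes from the start of the body (assumed convention).
    """
    header = struct.pack(">B", len(streams))
    body = b""
    for label, data in streams.items():
        header += struct.pack(">BII", label, len(body), len(data))
        body += data
    return header + body


coefficients = [40.0, -12.5, 3.0, 7.5, -1.0, 0.5]  # assumed transform output
mask = [1, 1, 0, 1, 0, 0]                          # 1 = region of interest, 0 = background
steps = {0: 4.0, 1: 0.5}                           # coarser quantisation for the background

segments = classify(coefficients, mask)
streams = {label: code_segment(vals, steps[label]) for label, vals in segments.items()}
bit_stream = concatenate(streams)
```

Because each segment is coded on its own and its position is recorded in the header, a receiver can later pick out one segment without touching the others.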

The method makes it possible for the receiver to have random access to those parts of the image
that are needed, see Figure 2, since the information on where in the stream to find the different
parts is known.

At the decoder one way of working could be, see Figure 2: 

1. Receive the bit stream and decode the header information needed. 

2. Find and decode the segment information needed. 

3. Create a mask, by using for example the technique described in P08512 and P08983SE, 
in the transform domain, describing which coefficients are needed to reconstruct the 
wanted objects or the background. 

4. Decode the needed segment data from the bit stream. 

5. Reconstruct the needed segments. 

6. Show image. 
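
Decoder steps 3 to 5 can be illustrated with a small sketch. It assumes the mask is a flat list of segment labels, one per coefficient position, and that a segment's values arrive in scan order; `reconstruct_coefficients` is a hypothetical helper name, and the inverse transform that would follow is left out.

```python
def reconstruct_coefficients(mask, decoded, fill=0.0):
    """Put decoded segment values back at the positions the mask assigns them.

    `mask[p]` is the segment label of coefficient position p.
    `decoded` maps a label to that segment's values in scan order; segments
    that were not decoded are absent, and their positions keep `fill`.
    """
    cursors = {label: iter(values) for label, values in decoded.items()}
    return [next(cursors[label]) if label in cursors else fill for label in mask]


mask = [1, 1, 0, 1, 0, 0]  # 1 = region of interest, 0 = background
roi_only = reconstruct_coefficients(mask, {1: [40.0, -12.5, 7.5]})
```

Here only the region of interest was decoded, so the background positions stay at the fill value until (and unless) the background segment is fetched as well.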

Bit Stream Description

In this section the components of a bit stream that could be needed for the technique are
presented.

Data Structures and Pointers

o Pointer 

A pointer is a set of symbols that defines the position of a bit or byte in a bit stream or a file.
Many ways of describing pointers have been defined in computer science. Any suitable such
method can be used here. A pointer can be implicitly defined by a specific bit stream
composition rule. A pointer can be defined relative to an explicitly or implicitly defined
position. A simple way of defining a pointer is to define the number of bits between the
required position and a known reference point such as e.g. the first bit in the bit stream.
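
The simple pointer described last, a bit count from a known reference point, could look like this in code. It is a minimal sketch; the function names and the most-significant-bit-first numbering are assumptions made for the illustration.

```python
def read_bit(stream: bytes, pointer: int) -> int:
    """Return the bit at `pointer`, counted from the first bit of the stream.

    Assumption: bits are numbered most-significant-bit first within each byte.
    """
    byte_index, bit_index = divmod(pointer, 8)
    return (stream[byte_index] >> (7 - bit_index)) & 1


def resolve(relative_pointer: int, reference: int = 0) -> int:
    """Turn a pointer given relative to a known reference into an absolute one."""
    return reference + relative_pointer


stream = bytes([0b10100000, 0b00000001])
first_bit = read_bit(stream, 0)               # pointer relative to the first bit
last_bit = read_bit(stream, resolve(7, 8))    # 7 bits after reference position 8
```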

o Topology description 

The topology descriptor, TOP, is a set of symbols that defines the topological relation between
the objects and the enumeration of the objects and shapes. This is illustrated in Figure MJ1,
where four objects O1, O2, O3, O4 and four shapes S1, S2, S3 and S4 are shown. The
topology of the image can e.g. be represented as a tree graph as shown in Fig. MJ2. The
nodes and edges of the tree graph can be coded in a data structure using well-known
methods. p_TOP is a pointer to a topology descriptor.
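
One common way to hold such a tree graph is an adjacency mapping from each node to its children. The sketch below is purely illustrative: the nesting of the objects is hypothetical, since the referenced figures are not reproduced here, and `path_to` is an assumed helper name.

```python
# Hypothetical nesting, chosen only to illustrate the data structure;
# it is not the layout shown in Fig. MJ2.
topology = {
    "O0": ["O1", "O4"],  # O0 stands for the background
    "O1": ["O2", "O3"],
    "O2": [],
    "O3": [],
    "O4": [],
}


def path_to(tree, root, target):
    """Return the nodes from `root` down to `target`, or None if absent."""
    if root == target:
        return [root]
    for child in tree[root]:
        below = path_to(tree, child, target)
        if below is not None:
            return [root] + below
    return None


branch = path_to(topology, "O0", "O3")
```

A walk of this kind gives the branch an object sits on, which is the sort of information a server needs when deciding which shapes go with a requested object.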

o Shape description 

A shape descriptor, Si, defines the shape of a closed boundary of an object. The shape number,
i, is given by a topology descriptor. Many different shape coding techniques can be used.
Examples of such methods are chain coding [REF] and the shape coding method in MPEG-4
[REF]. Shape descriptors can be decoded independently once their position in the bit
stream is known. p_Si is a pointer to a shape descriptor.

o Segment description 

A segment descriptor, Ti, is a compressed set of symbols that encodes a segment as described
above. The segment contains an ordered set of subsets. The object number, i, is given by a
topology descriptor. p_Ti is a pointer to a segment descriptor.

o Subset description 

A subset descriptor, Bij, is an independently decodable subset, j, of a segment descriptor, Ti,
that describes e.g. the coefficients that belong to a given subband, j, as described above.
p_Bij is a pointer to a subset descriptor.

o Multiplexed segment description 

Several segment descriptions, { Ti, Tj, Tk, ... }, can be multiplexed into a joint data structure MT(i, j,
k). This is typically done for the purpose of simultaneous progressive transmission of a set of
objects. The data structure, MT, is called a multiplexed segment descriptor. Several
multiplexing methods can be used. p_MT is a pointer to a multiplexed segment descriptor.

o Segment multiplexing methods

Examples of multiplexing methods are shown in Fig. 4. A simple method is to
interleave the subsets belonging to the component segments so that

MT(i, j, k) = { Bi0, Bj0, Bk0, Bi1, Bj1, Bk1, Bi2, Bj2, Bk2, ... },

where the order of the symbols corresponds to the order in the bit stream, with symbols
to the left being sent first. Subsets in a multiplexed stream may be excluded if they are
already known by the decoder.
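
The interleaving rule can be sketched in a few lines. This is a sketch under assumptions: subsets are represented as byte strings, segments may have different numbers of subsets, and `multiplex` and its `known` argument are hypothetical names.

```python
from itertools import zip_longest


def multiplex(segments, known=frozenset()):
    """Interleave subsets round-robin: all subsets 0, then all subsets 1, ...

    `segments` maps a segment index to its ordered list of subsets.
    `known` holds (segment, subset_index) pairs the decoder already has;
    those are left out of the multiplexed stream.
    """
    order = list(segments)
    stream = []
    for j, row in enumerate(zip_longest(*(segments[s] for s in order))):
        for s, subset in zip(order, row):
            if subset is not None and (s, j) not in known:
                stream.append((s, j, subset))
    return stream


segments = {
    "i": [b"i0", b"i1", b"i2"],
    "j": [b"j0", b"j1"],
    "k": [b"k0", b"k1", b"k2"],
}
labels = [(s, j) for s, j, _ in multiplex(segments)]
```

Entries earlier in the returned list correspond to symbols further to the left in MT, i.e. those sent first.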

Bit stream storage format

In order to achieve random access to any image object, the stored bit stream or file structure
should include at least the following components:

In the image header if needed: 

Topology descriptor TOP 

Pointers to shape descriptors: { p_S1, p_S2, ..., p_SN }
Pointers to segment descriptors: { p_T0, p_T1, ..., p_TN }
Optionally pointers to subset descriptors: for each k in [0, N]
{ p_Bk0, p_Bk1, ..., p_BkN }

In the body of the stored bit stream if needed:
Shape descriptors: { S1, S2, ..., SN }
Segment descriptors: { T0, T1, ..., TN }

A group of segment descriptors with indices {k, l, m, ...} can optionally be replaced with
a multiplexed segment descriptor MT(k, l, m, ...).
N is the number of stored objects. The background is the object with index 0.
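
A reduced version of this layout, covering only the subset pointers and the segment body, might be sketched as below. The field widths, the byte-offset-from-start convention and the names `store` and `read_subset` are assumptions for the illustration; the topology and shape descriptors are omitted for brevity.

```python
import struct


def store(segments):
    """Lay out subset descriptors in a body and prefix a table of pointers.

    `segments[k]` is the ordered list of coded subsets (bytes) of segment k;
    index 0 is the background. Pointers are byte offsets from the start of
    the stored stream (an assumed convention).
    """
    counts = [len(subsets) for subsets in segments]
    # Header: segment count, subsets per segment, one (offset, length) pair per subset.
    header_size = 2 + 2 * len(segments) + 8 * sum(counts)
    table, body, position = [], b"", header_size
    for subsets in segments:
        for data in subsets:
            table.append((position, len(data)))
            body += data
            position += len(data)
    header = struct.pack(">H", len(segments))
    header += struct.pack(f">{len(counts)}H", *counts)
    for offset, length in table:
        header += struct.pack(">II", offset, length)
    return header + body


def read_subset(stored, k, j):
    """Fetch subset j of segment k using only the pointer table."""
    (n,) = struct.unpack_from(">H", stored, 0)
    counts = struct.unpack_from(f">{n}H", stored, 2)
    entry = sum(counts[:k]) + j
    offset, length = struct.unpack_from(">II", stored, 2 + 2 * n + 8 * entry)
    return stored[offset:offset + length]


stored = store([[b"bg0", b"bg1"], [b"roi0", b"roi1", b"roi2"]])
wanted = read_subset(stored, 1, 2)
```

The point of the pointer table is that `read_subset` never has to parse the subsets that precede the one it wants.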

Progressive transmission of objects with random access

A server receives a request to send image data to a client. The image is stored at the server
in the storage format described above. Some of the stored data structures (topological information,
shapes, segments and subsets) may already have been sent to the receiving terminal. This section
describes a procedure for composing a bit stream at the server that serves the request.

Examples:
Client request

A primitive request contains the following information:

Send objects with numbers k, l, m, ... to accuracy nk, nl, nm, ... respectively, where the accuracy is the
index of the highest subset that will be sent for each object.

Several primitive requests can be sent. They will be served in the order that they are received or
in an order that is otherwise specified.

Procedure for serving a request (details) 

Send topological information if needed. TOP is sent in response to the first request of 
information about an image. 

Send all shapes that are necessary for describing the boundary of the requested objects.
Shapes that are already known to the decoder need not be sent. Using the topological tree in
Figure MJ2, we find that all shapes on the same branch as the object, at the same or a lower
hierarchical level, need to be sent. The server knows the state of the decoder and will only send
shapes that are not known at the decoder.

Send (multiplexed) subset descriptors that describe the requested objects to the defined
accuracy. Subset descriptors that are already known to the decoder need not be sent. For
example, if the client knows the subsets { Bk0, Bk1, Bk2, Bk3, Bk4 } of segment k, the subset
descriptors { Bk5, Bk6, Bk7 } need to be sent if object k is requested to accuracy 7.
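
The bookkeeping behind this step can be sketched as follows, assuming the server tracks the decoder state as a set of known subset indices per object. The function name, the data layout and the example numbers are hypothetical.

```python
def subsets_to_send(requests, known):
    """List the (object, subset) pairs the server still has to send.

    `requests` maps an object number to the requested accuracy, i.e. the
    index of the highest subset wanted. `known` maps an object number to the
    set of subset indices the decoder already holds; it is updated in place
    to reflect the decoder state after sending.
    """
    to_send = []
    for obj, accuracy in requests.items():
        have = known.setdefault(obj, set())
        for j in range(accuracy + 1):
            if j not in have:
                to_send.append((obj, j))
                have.add(j)
    return to_send


decoder_state = {2: {0, 1, 2}}  # object 2 is already known up to subset 2
first = subsets_to_send({2: 4, 5: 1}, decoder_state)
repeat = subsets_to_send({2: 4, 5: 1}, decoder_state)
```

Repeating the same request yields nothing further to send, since the server has recorded what the decoder now holds.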

5.3 Examples

In this section some examples of situations where the proposed method can be used are explained.
Assume that there is a region in the middle of the image that has the shape of a circle and that has to
have a better quality than the region outside the circle, henceforth called the background. Both
the background and the region should, however, be transmitted simultaneously. The following
then takes place:

✓ The original image is transformed with a wavelet transform.

✓ A mask in the transform domain is then created. This mask describes which coefficients are
needed in the transform domain in order to reconstruct the region and the background.
✓ The created mask is then used to classify the coefficients in the transform domain into two
segments, one for the region and one for the background.

✓ The two segments are built up of a number of subsets. The number of subsets is, in this
example, the same as the number of subbands in the transform domain. The present
situation is then:

o For the region segment:

{{r_{0,1}, r_{0,2}, ..., r_{0,i}}, ..., {r_{no_subbands,1}, r_{no_subbands,2}, ..., r_{no_subbands,j}}}, where i, j =
the number of coefficients in the different subsets.

o For the background segment:

{{b_{0,1}, b_{0,2}, ..., b_{0,p}}, ..., {b_{no_subbands,1}, b_{no_subbands,2}, ..., b_{no_subbands,q}}}, where p, q =
the number of coefficients in the different subsets.
✓ The two segments are then coded, yielding the following:

o For the region segment:

A shape descriptor Sr and a segment descriptor Tr = {Br,0, Br,1, ..., Br,no_subbands} and a set
of subset pointers {p_Br,0, p_Br,1, ..., p_Br,no_subbands}.

o For the background segment:

A segment descriptor Tb = {Bb,0, Bb,1, ..., Bb,no_subbands} and a set of subset pointers
{p_Bb,0, p_Bb,1, ..., p_Bb,no_subbands}.

✓ The two segments are then mixed together into a single bit stream as follows:

o <image header><TOP><Sr><{p_Bb,0, p_Br,0, p_Bb,1, p_Br,1, ..., p_Bb,no_subbands,
p_Br,no_subbands}><MT(b,r) = {Bb,0, Br,0, Bb,1, Br,1, ..., Bb,no_subbands, Br,no_subbands}>

In this case the subsets are mixed as shown in the top part of Figure 4.

Note that in the case where the receiver knows the order in which the different parts of the
image are sent, the TOP field is not needed.

The first part of the array, from <image header> to <{p_B...}>,
is in other words a definition of where the different image
regions are placed in the rest of the compressed bit stream,
<MT(b,r) = {...B...}>.

✓ The mixed bit stream is then sent to the receiver.

At the decoder side the following will happen: 

✓ The image header together with the topology, shape information and the pointers is read.

✓ The decoder can now create the same mask as above.

✓ The decoder creates the segments with the underlying subsets.

✓ The decoder starts decoding the mixed bit stream and fills in the transmitted transform
coefficients in the corresponding subsets.

✓ An inverse transform is applied.

✓ The image is sent and reconstructed.

This is one way of using the proposed method. Other ways can be to mix the bit streams in a
different way. For example, the region can be transmitted first, followed by the background.
Another example is that there is more than one region, in which case the regions can be mixed in a
number of ways.

5.4 Advantages

The proposed method has the following advantages:

o taking care of multiple regions of interest

o classifying the coefficients into segments that are coded independently

o making it possible to send the shape information only when needed

o the ability to have random access in the bit stream to the parts of the image which are in some
way vital to the user, without having to decode the whole image



CLAIMS

1. A method in transmission of an image between a
transmitter and a receiver, the method comprising the 
steps of: 

- partitioning of the image into at least two image regions; 
and 

- coding the image regions into a coded symbol stream, the 
coding using a symbolic representation and having 
predetermined levels of accuracy in the image regions; and 

- compressing the coded symbol stream into a compressed bit 
stream, 

characterized in that the method comprises the steps of: 

- generating a definition of the different image regions in
the compressed bit stream;

- transmitting said definition to the receiver; 

- transmitting the compressed bit stream to the receiver; 
and 

- decoding in the receiver predetermined parts of the 
compressed bit stream with the aid of said definition. 



2. An apparatus for transmission of an image comprising:
- a transmitter and a receiver;

- means for partitioning of the image into at least two image
regions;

- a coding device for coding the image regions into a coded
symbol stream, the coding device using a symbolic
representation and having predetermined levels of accuracy
in the regions;

- a compressing device for compressing the coded symbol
stream into a compressed bit stream; and

- means in the transmitter for transmitting said compressed
bit stream to the receiver,

characterized in that the apparatus also comprises:

- means for generating a definition of the different image
regions in the compressed bit stream;

- means in the transmitter for transmitting said definition
to the receiver;

- decoder in the receiver for decoding predetermined parts
of the compressed bit stream with the aid of said
An image (3), which is available in digitized form, is to be transmitted
over a channel between a transmitter and a receiver.
The channel has limited bandwidth, and the image has on the one hand a
less important background (R1) and on the other hand areas of particular importance,
regions of interest (R2, Rn). The image is transformed into
transform coefficients and compressed (21), and a mask
corresponding to the regions (R1, R2, Rn) is defined in the
transform domain (22). The transform coefficients are classified (23)
and assigned, according to the mask definition, to different
segments (SG1, SG2, SGn). These are coded (24) independently of
each other to different degrees of accuracy depending on how important
the corresponding region (R1, R2, Rn) in the image (3) is. The coding
yields partial bit streams (25), which are linked together (26) with
image header information (271, 272) into a bit stream (27) that is
sent to the receiver. The receiver decodes the image header and
the segment information and recreates the mask in the
transform domain, including the shape and positions of the regions
(R1, R2, Rn). The image is then reconstructed with the aid thereof to
the desired accuracy in the respective region. Several regions
(R2, Rn) with different degrees of image quality can be defined, and
only the interesting parts of the image need to be decoded.



Publication figure: Figure 2



o the receiver does not even have to receive the whole bit stream belonging to a certain region
or the background if a progressive coding method is used.
[Flowchart: create mask of segments in the transform domain (e.g. as in P08512 and P08983SE); a set of bit streams, one for each subset; create the bit stream by concatenating the different bit streams; send the concatenated bit stream. Inset: example of concatenation.]

Figure 1: The method of how to code different regions in the encoder.
[Flowchart: extract header information from the stream; find the segment information in the stream; create mask of segments in the transform domain; decode the needed segment data from the bit stream; reconstruct the needed segments; decoded image.]

Figure 2: A method for extracting the needed segments from the bit stream.
[Diagram: one possible decomposition. The coefficients of segment 1, {a_1, ..., a_n}, and of segment 2, {b_1, ..., b_m}, are each divided into subsets, where n = the number of transform coefficients belonging to segment 1, n = i + j + k, m = the number of transform coefficients belonging to segment 2 and m = o + p + q.]

Figure 3: Classification of transform coefficients into subsets in the case of two segments.
Figure 4: Different ways of concatenating the segments into the bitstream.